Generating Component-based Supervised Learning Programs From Crowdsourced Examples
Authors
Abstract
We present CrowdLearn, a new system that processes an existing corpus of crowdsourced machine learning programs to learn how to generate effective pipelines for solving supervised machine learning problems. CrowdLearn uses a probabilistic model of program likelihood, conditioned on the current sequence of pipeline components and on the characteristics of the input data to the next component in the pipeline, to predict candidate pipelines. Our results highlight the effectiveness of this technique in leveraging existing crowdsourced programs to generate pipelines that work well on a range of supervised learning problems.
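The abstract does not specify the form of the probabilistic model, so the following is only a minimal illustrative sketch of the general idea, not CrowdLearn's actual method. It assumes a count-based model with Laplace smoothing, conditioned on just the previous component and a single coarse label describing the input data. The toy corpus, the data labels (`has_missing`, `numeric`, `scaled`), and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

START = "<start>"  # hypothetical marker for the beginning of a pipeline

def fit_counts(corpus):
    """Count which component follows each (previous component, data label) context."""
    counts = defaultdict(Counter)
    for steps in corpus:
        prev = START
        for data_label, component in steps:
            counts[(prev, data_label)][component] += 1
            prev = component
    return counts

def next_component_probs(counts, prev, data_label, vocab, alpha=1.0):
    """Laplace-smoothed estimate of P(component | previous component, data label)."""
    ctx = counts.get((prev, data_label), Counter())
    total = sum(ctx.values()) + alpha * len(vocab)
    return {c: (ctx[c] + alpha) / total for c in vocab}

def pipeline_log_likelihood(counts, steps, vocab, alpha=1.0):
    """Sum of log-probabilities of each component given its context."""
    prev, total = START, 0.0
    for data_label, component in steps:
        probs = next_component_probs(counts, prev, data_label, vocab, alpha)
        total += math.log(probs[component])
        prev = component
    return total

# Toy corpus (assumption): each program is a list of
# (label of the data entering the step, component applied at that step).
corpus = [
    [("has_missing", "SimpleImputer"), ("numeric", "StandardScaler"),
     ("scaled", "LogisticRegression")],
    [("has_missing", "SimpleImputer"), ("numeric", "RandomForestClassifier")],
    [("numeric", "StandardScaler"), ("scaled", "LogisticRegression")],
]
vocab = sorted({component for steps in corpus for _, component in steps})
counts = fit_counts(corpus)

# Rank candidate first components for data that still has missing values.
first_probs = next_component_probs(counts, START, "has_missing", vocab)
best_first = max(first_probs, key=first_probs.get)
print(best_first, round(first_probs[best_first], 3))
```

In a real system the data characteristics would presumably be measured by running the partial pipeline on the dataset. Here they are supplied by hand to keep the sketch self-contained.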
Similar resources
Predicting a Correct Program in Programming By Example
We study the problem of efficiently predicting a correct program from a large set of programs induced from few input-output examples in Programming-by-Example (PBE) systems. This is an important problem for making PBE systems usable so that users do not need to provide too many examples to learn the desired program. We first characterize the three main types of expressions used for expression s...
Predicting a Correct Program in Programming by Example
We study the problem of efficiently predicting a correct program from a large set of programs induced from few input-output examples in Programming-by-Example (PBE) systems. This is an important problem for making PBE systems usable so that users do not need to provide too many examples to learn the desired program. We first formalize the two classes of sharing that occur in version-space algeb...
Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge
The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory has been applied to clustering problems. Previous research showed that this theory can significantly increase the stability and performance of...
Improving Quality of Crowdsourced Labels via Probabilistic Matrix Factorization
Quality assurance in crowdsourced annotation often involves having a given example labeled multiple times by different workers, then aggregating these labels. Unfortunately, the worker-example label matrix is typically sparse and imbalanced for two reasons: 1) the average crowd worker judges few examples; and 2) few labels are typically collected per example to reduce cost. To address this miss...